The Twofish Encryption Algorithm: A 128-Bit Block Cipher
by Bruce Schneier, John Kelsey, Doug Whiting, David Wagner, Chris Hall, Niels Ferguson
Wiley Computer Publishing, John Wiley & Sons, Inc.
ISBN: 0471353817
Pub Date: 03/01/99
Twofish has been designed from the start with performance in mind. It is efficient on a variety of platforms: 32-bit CPUs, 8-bit smart cards, and dedicated VLSI hardware. More importantly, though, Twofish has been designed to allow several layers of performance tradeoffs, depending on the relative importance of encryption speed, key setup, memory use, hardware gate count, and other implementation parameters. The result is a highly flexible algorithm that can be implemented efficiently in a variety of cryptographic applications.
All these options are interoperable; these are simply implementation tradeoffs and do not affect the mathematics of Twofish. One end of a communication could use the fastest Pentium II implementation, and the other the cheapest hardware implementation.
Table 5.1 gives Twofish's performance, for encryption or decryption, with different key scheduling options on several modern microprocessors, using different languages and compilers. Each implementation is presented on a single line. The first column gives the CPU the implementation was run on (PPro/II = Pentium Pro/Pentium II, U-SPARC = Ultra-SPARC, PPC = Power PC). The second column is the programming language (ASM = assembly language, MS C = Microsoft Visual C++ 4.2, BC = Borland C 5.0, C = standard C compiler). The third column is the keying option; the keying options are explained below. The code size column contains the approximate total code size (in bytes) of the routines for encryption, decryption, and key setup, where available. All remaining numbers in the row are in clock cycles. For each key size we show the number of clock cycles required for the key setup, and the number of clock cycles required to encrypt a single block. The times for encryption and decryption are identical in assembly, and encryption is slightly slower than decryption in C; only the encryption (i.e., the larger) number is given. There is no time required to set up the algorithm except for key setup. The time required to change a key is the same as the time required to set up a key.
For example, on a Pentium Pro a fully optimized assembly-language version of Twofish can encrypt or decrypt data in 258 clock cycles per block, or 16.1 clock cycles per byte, after an 8600-clock key setup for a 128-bit key (equivalent to encrypting about 33 blocks). On a 200 MHz Pentium Pro microprocessor, this translates to about 99 Mbits/sec.
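The conversions in this example are plain arithmetic on the clock counts. A minimal Python sketch follows, taking its inputs from the PPro/II, ASM, Comp. row of Table 5.1 for a 128-bit key. The constant and function names are illustrative and do not come from the book; other rows or clock rates can be substituted.

```python
# Inputs from the PPro/II, ASM, Comp. row of Table 5.1 (128-bit key).
BLOCK_BYTES = 16             # Twofish is a 128-bit block cipher
CLOCKS_PER_BLOCK = 258       # clocks to encrypt one block
KEY_SETUP_CLOCKS = 8600      # clocks to set up a 128-bit key
CLOCK_HZ = 200_000_000       # 200 MHz Pentium Pro


def clocks_per_byte(clocks_per_block, block_bytes=BLOCK_BYTES):
    """Clock cycles spent per byte of data."""
    return clocks_per_block / block_bytes


def throughput_mbits(clocks_per_block, clock_hz, block_bytes=BLOCK_BYTES):
    """Bulk throughput in Mbits/sec, ignoring key setup."""
    blocks_per_second = clock_hz / clocks_per_block
    return blocks_per_second * block_bytes * 8 / 1e6


def key_setup_in_blocks(key_setup_clocks, clocks_per_block):
    """Key setup cost expressed as a number of block encryptions."""
    return key_setup_clocks / clocks_per_block


if __name__ == "__main__":
    print(f"{clocks_per_byte(CLOCKS_PER_BLOCK):.1f} clocks/byte")
    print(f"{throughput_mbits(CLOCKS_PER_BLOCK, CLOCK_HZ):.0f} Mbits/sec")
    print(f"{key_setup_in_blocks(KEY_SETUP_CLOCKS, CLOCKS_PER_BLOCK):.0f} blocks")
```

The throughput figure covers bulk encryption only; for short messages the key setup term dominates, which is why the table lists it separately.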
Processor | Lang | Keying Option | Code Size (bytes) | Clocks to Key (128-bit) | Clocks to Key (192-bit) | Clocks to Key (256-bit) | Clocks to Encrypt (128-bit) | Clocks to Encrypt (192-bit) | Clocks to Encrypt (256-bit) |
---|---|---|---|---|---|---|---|---|---|
PPro/II | ASM | Comp. | 9000 | 8600 | 11300 | 14100 | 258 | 258 | 258 |
PPro/II | ASM | Full | 8500 | 7600 | 10400 | 13200 | 315 | 315 | 315 |
PPro/II | ASM | Part. | 10700 | 4900 | 7600 | 10500 | 460 | 460 | 460 |
PPro/II | ASM | Min. | 13600 | 2400 | 5300 | 8200 | 720 | 720 | 720 |
PPro/II | ASM | Zero | 9100 | 1250 | 1600 | 2000 | 860 | 1130 | 1420 |
PPro/II | MS C | Full | 11200 | 8000 | 11200 | 15700 | 600 | 600 | 600 |
PPro/II | MS C | Part. | 13200 | 7100 | 9700 | 14100 | 800 | 800 | 800 |
PPro/II | MS C | Min. | 16600 | 3000 | 7800 | 12200 | 1130 | 1130 | 1130 |
PPro/II | MS C | Zero | 10500 | 2450 | 3200 | 4000 | 1310 | 1750 | 2200 |
PPro/II | BC | Full | 14100 | 10300 | 13600 | 18800 | 640 | 640 | 640 |
PPro/II | BC | Part. | 14300 | 9500 | 11200 | 16600 | 840 | 840 | 840 |
PPro/II | BC | Min. | 17300 | 4600 | 10300 | 15300 | 1160 | 1160 | 1160 |
PPro/II | BC | Zero | 10100 | 3200 | 4200 | 4800 | 1910 | 2670 | 3470 |
Pentium | ASM | Comp. | 9100 | 12300 | 14600 | 17100 | 290 | 290 | 290 |
Pentium | ASM | Full | 8200 | 11000 | 13500 | 16200 | 315 | 315 | 315 |
Pentium | ASM | Part. | 10300 | 5500 | 7800 | 9800 | 430 | 430 | 430 |
Pentium | ASM | Min. | 12600 | 3700 | 5900 | 7900 | 740 | 740 | 740 |
Pentium | ASM | Zero | 8700 | 1800 | 2100 | 2600 | 1000 | 1300 | 1600 |
Pentium | MS C | Full | 11800 | 11900 | 15100 | 21500 | 630 | 630 | 630 |
Pentium | MS C | Part. | 14100 | 9200 | 13400 | 19800 | 900 | 900 | 900 |
Pentium | MS C | Min. | 17800 | 3800 | 11100 | 16900 | 1460 | 1460 | 1460 |
Pentium | MS C | Zero | 11300 | 2800 | 3900 | 4900 | 1740 | 2260 | 2760 |
Pentium | BC | Full | 12700 | 14200 | 18100 | 26100 | 870 | 870 | 870 |
Pentium | BC | Part. | 14200 | 11200 | 16500 | 24100 | 1100 | 1100 | 1100 |
Pentium | BC | Min. | 17500 | 4700 | 12100 | 19200 | 1860 | 1860 | 1860 |
Pentium | BC | Zero | 11800 | 3700 | 4900 | 6100 | 2150 | 2730 | 3270 |
U-SPARC | C | Full | | 16600 | 21600 | 24900 | 750 | 750 | 750 |
U-SPARC | C | Part. | | 8300 | 13300 | 19900 | 930 | 930 | 930 |
U-SPARC | C | Min. | | 3300 | 11600 | 16600 | 1200 | 1200 | 1200 |
U-SPARC | C | Zero | | 1700 | 3300 | 5000 | 1450 | 1680 | 1870 |
PPC 750 | C | Full | | 12200 | 17100 | 22200 | 590 | 590 | 590 |
PPC 750 | C | Part. | | 7800 | 12200 | 17300 | 780 | 780 | 780 |
PPC 750 | C | Min. | | 2900 | 9100 | 14200 | 1280 | 1280 | 1280 |
PPC 750 | C | Zero | | 2500 | 3600 | 4900 | 1030 | 1580 | 2040 |
68040 | C | Full | 16700 | 53000 | 63500 | 96700 | 3500 | 3500 | 3500 |
68040 | C | Part. | 18100 | 36700 | 47500 | 78500 | 4900 | 4900 | 4900 |
68040 | C | Min. | 23300 | 11000 | 40000 | 71800 | 8150 | 8150 | 8150 |
68040 | C | Zero | 16200 | 9800 | 13300 | 17000 | 6800 | 8600 | 10400 |
Table 5.1. Twofish Performance with Different Key Lengths and Options
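One way to read Table 5.1 is to ask which keying option minimizes the total work for a message of a given length, counting key setup plus per-block encryption. The sketch below does this for the PPro/II assembly rows with a 128-bit key. The dictionary and function names are illustrative, and the cost model is an assumption: it counts clock cycles only and ignores code size and memory use, which the text also lists as tradeoff parameters.

```python
# PPro/II, ASM, 128-bit key, copied from Table 5.1:
# keying option -> (clocks to key, clocks to encrypt one block)
PPRO_ASM_128 = {
    "Comp.": (8600, 258),
    "Full": (7600, 315),
    "Part.": (4900, 460),
    "Min.": (2400, 720),
    "Zero": (1250, 860),
}


def total_clocks(option, blocks, table=PPRO_ASM_128):
    """Key setup plus encryption of `blocks` 16-byte blocks under one key."""
    setup, per_block = table[option]
    return setup + blocks * per_block


def cheapest_option(blocks, table=PPRO_ASM_128):
    """Keying option with the lowest total clock count for this message size."""
    return min(table, key=lambda option: total_clocks(option, blocks, table))


if __name__ == "__main__":
    for blocks in (1, 9, 10, 1000):
        option = cheapest_option(blocks)
        print(blocks, option, total_clocks(option, blocks))
```

Under this model the low-setup options win for messages of a few blocks, and the options with heavier key setup win once the setup cost is amortized over many blocks.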